The RWTH/UPB/FORTH System Combination for the 4th CHiME Challenge Evaluation

Authors

  • Tobias Menne
  • Jahn Heymann
  • Anastasios Alexandridis
  • Kazuki Irie
  • Albert Zeyer
  • Markus Kitza
  • Pavel Golik
  • Ilia Kulikov
  • Lukas Drude
  • Ralf Schlüter
  • Hermann Ney
  • Reinhold Haeb-Umbach
  • Athanasios Mouchtaris

Abstract

This paper describes the automatic speech recognition (ASR) systems developed jointly by RWTH, UPB and FORTH for the 1ch, 2ch and 6ch tracks of the 4th CHiME Challenge. In the 2ch and 6ch tracks, the final system output is obtained by a Confusion Network Combination (CNC) of multiple systems. The Acoustic Model (AM) is a deep neural network based on Bidirectional Long Short-Term Memory (BLSTM) units. The systems differ in their front ends and in the training sets used for acoustic model training. The model for the 1ch track is trained without any preprocessing. For each front end, we trained and evaluated individual acoustic models. We compare the ASR performance of different beamforming approaches: a conventional superdirective beamformer [1] and an MVDR beamformer as in [2], where the steering vector is estimated based on [3]. Furthermore, we evaluated a BLSTM-supported Generalized Eigenvalue beamformer using NN-GEV [4]. The back end is implemented using RWTH’s open-source toolkits RASR [5], RETURNN [6] and rwthlm [7]. We rescore lattices with a Long Short-Term Memory (LSTM) based language model. The overall best results are obtained by a system combination that includes the lattices from the system of UPB’s submission [8]. Our final submission scored second in each of the three tracks of the 4th CHiME Challenge.
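
To make the combination step mentioned in the abstract more concrete, the following is a minimal sketch of the voting idea behind a confusion network combination. It is not the authors' implementation: it assumes the confusion networks of all systems have already been aligned to the same number of slots (real CNC has to compute that alignment), and the function name `combine_confusion_networks`, the `<eps>` token for empty arcs, the uniform system weights, and the toy posteriors are all hypothetical.

```python
from collections import defaultdict

EPS = "<eps>"  # hypothetical marker for an empty (skip) arc


def combine_confusion_networks(networks, weights=None):
    """Pick the best word per slot from several pre-aligned confusion networks.

    networks: list of confusion networks, one per system. Each network is a
              list of slots, and each slot maps a word to its posterior.
    weights:  optional per-system weights; uniform if omitted (assumption).
    """
    if weights is None:
        weights = [1.0 / len(networks)] * len(networks)
    num_slots = len(networks[0])
    if any(len(cn) != num_slots for cn in networks):
        raise ValueError("this sketch expects pre-aligned networks")

    hypothesis = []
    for slot_index in range(num_slots):
        # Sum the weighted posteriors of every word over all systems.
        scores = defaultdict(float)
        for cn, weight in zip(networks, weights):
            for word, posterior in cn[slot_index].items():
                scores[word] += weight * posterior
        best_word = max(scores, key=scores.get)
        if best_word != EPS:  # empty arcs produce no output word
            hypothesis.append(best_word)
    return hypothesis


# Toy posteriors for three hypothetical systems (not results from the paper).
system_a = [{"the": 0.9, "a": 0.1}, {"cat": 0.6, "cap": 0.4}, {EPS: 0.7, "sat": 0.3}]
system_b = [{"the": 0.8, "a": 0.2}, {"cap": 0.7, "cat": 0.3}, {"sat": 0.9, EPS: 0.1}]
system_c = [{"the": 1.0}, {"cat": 0.8, "cap": 0.2}, {"sat": 0.6, EPS: 0.4}]

print(combine_confusion_networks([system_a, system_b, system_c]))
# prints ['the', 'cat', 'sat']
```

In this toy case, system B alone would output "cap" in the second slot and system A alone would skip the third word, but the summed posteriors recover both words, which illustrates why combining systems with different front ends can help.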

Similar resources

Deep Beamforming and Data Augmentation for Robust Speech Recognition: Results of the 4th CHiME Challenge

Robust automatic speech recognition in adverse environments is a challenging task. We address the 4th CHiME challenge [1] multi-channel tracks by proposing a deep eigenvector beamformer as front-end. To train the acoustic models, we propose to supplement the beamformed data by the noisy audio streams of the individual microphones provided in the real set. Furthermore, we perform data augmentation...


The RWTH 2009 quaero ASR evaluation system for English and German

In this work, the RWTH automatic speech recognition systems for English and German for the second Quaero evaluation campaign 2009 are presented. The systems are designed to transcribe web data, European parliament plenary sessions and broadcast news data. Another challenge in the 2009 evaluation is that almost no in-domain training data is provided and the test data contains a large variety of ...


The System Combination RWTH Aachen: SYSTRAN for the NTCIR-10 PatentMT Evaluation

This paper describes the joint submission by RWTH Aachen University and SYSTRAN in the Chinese-English Patent Machine Translation Task at the 10th NTCIR Workshop. We specify the statistical systems developed by RWTH Aachen University and the hybrid machine translation systems developed by SYSTRAN. We apply RWTH Aachen’s combination techniques to create consensus hypotheses from very different s...


A fragment-decoding plus missing-data imputation ASR system evaluated on the 2nd CHiME Challenge

This paper reports on our entry to the small-vocabulary, moving-talker track of the 2nd CHiME challenge. The system we employ is based on the one that we developed for the 1st CHiME challenge, the latest results of which are reported in (Ma and Barker, 2012). Our motivation is to benchmark the system on the new CHiME challenge and to measure the extent to which it is robust against speaker moti...


The RWTH machine translation system for IWSLT 2007

The RWTH system for the IWSLT 2007 evaluation is a combination of several statistical machine translation systems. The combination includes phrase-based models, an n-gram translation model and a hierarchical phrase model. We describe the individual systems and the method that was used for combining the system outputs. Compared to our 2006 system, we newly introduce a hierarchical phrase-based tr...


Publication date: 2016